Efficient Synchronization for a Large-scale Multi-core Chip Architecture

Authors

  • Weirong Zhu
  • Guang R. Gao
Abstract

Multi-core architectures are becoming mainstream, permitting increasing on-chip parallelism through hardware support for multithreading. Synchronization, especially fine-grain synchronization, is essential to the effective utilization of the computational power of high-performance large-scale multi-core architectures. However, designing and implementing fine-grain synchronization in such architectures presents several challenges, including synchronization-induced overhead, storage cost, scalability, and the level of granularity to which synchronization is applicable. Using the 160-core IBM Cyclops-64 multi-core chip architecture as a case study, this dissertation first presents a thorough performance measurement, evaluation, and customization of a range of widely used synchronization mechanisms. It then proposes the Synchronization State Buffer (SSB), a scalable architectural design for fine-grain synchronization that efficiently enforces word-level mutual exclusion and read-after-write data dependencies between concurrent threads. The design of SSB is motivated by a simple observation: at any instant during parallel execution, only a small fraction of memory locations are actively participating in synchronization. Based on this observation, we present a fine-grain synchronization design that records and manages the states of frequently synchronized data using modest hardware support. We have implemented the SSB design in the context of the IBM Cyclops-64 architecture. Detailed simulation shows that the SSB-based fine-grain synchronization solution yields significant performance gains for a set of selected benchmarks with different workload characteristics.
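The observation behind SSB can be illustrated with a small software model. This is only a sketch under stated assumptions, not the dissertation's hardware design: the class name `SyncStateBuffer`, the default capacity of 4 entries, and the `"ok"` / `"busy"` / `"full"` return values are all hypothetical, and the model covers only word-level mutual exclusion, not read-after-write dependencies. The point it shows is that state is kept only for words that are currently locked, so a small buffer can serve a large memory.

```python
# Illustrative software model only. Names, capacity, and return values
# are assumptions made for this sketch, not taken from the dissertation.
class SyncStateBuffer:
    """Keeps lock state only for words that are actively synchronized."""

    def __init__(self, capacity=4):
        self.capacity = capacity   # hypothetical number of buffer entries
        self.entries = {}          # word address -> id of the holding thread

    def try_lock(self, addr, thread_id):
        """Return 'ok', 'busy' (word already held), or 'full' (no free entry)."""
        if addr in self.entries:
            return "busy"
        if len(self.entries) >= self.capacity:
            return "full"          # a caller would need some fallback path here
        self.entries[addr] = thread_id
        return "ok"

    def unlock(self, addr, thread_id):
        """Release the word and free its entry so the slot can be reused."""
        if self.entries.get(addr) != thread_id:
            raise ValueError("word is not locked by this thread")
        del self.entries[addr]


# Usage: two threads contend for one word; a third word finds the buffer full.
ssb = SyncStateBuffer(capacity=2)
results = [
    ssb.try_lock(0x100, thread_id=0),
    ssb.try_lock(0x100, thread_id=1),
    ssb.try_lock(0x108, thread_id=1),
    ssb.try_lock(0x110, thread_id=2),
]
print(results)
```

Because an entry is freed on `unlock`, the buffer size in this model depends on how many words are locked at the same moment, not on the total number of words in memory.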

Similar resources

Efficient Fine-Grain Synchronization on a Multi-Core Chip Architecture: A Fresh Look

Multi-core chip architectures are becoming mainstream, permitting increasing on-chip parallelism through hardware support for multithreading. Fine-grain synchronization is essential to the effective utilization of the capacity provided by future high-performance multi-core architectures. However, there are also new challenges realizing such fine-grain synchronization in large-scale multi-core c...

Design of a novel congestion-aware communication mechanism for wireless NoC architecture in multicore systems

Hybrid Wireless Network-on-Chip (WNoC) architecture has emerged as a scalable communication structure to mitigate the deficits of traditional NoC architecture for future multi-core systems. The hybrid WNoC architecture provides energy-efficient, high-data-rate, and flexible communications for NoC architectures. In these architectures, each wireless router is shared by a set of processing core...

A Study of Parallel Betweenness Centrality Algorithm on a Manycore Architecture

Large scale graph analysis algorithms–such as those in SCCA2 benchmarks studied in this paper–play an increasingly important role in high performance computing applications. Different from most of traditional scientific computing applications, graph algorithms often show dynamic and irregular computing behavior. It is difficult to attain good performance on large scale conventional parallel arc...

Area and Performance Optimization of Barrier Synchronization on Multi-core Network-on-Chips

Barrier synchronization is commonly and widely used to synchronize the execution of parallel processor cores on multi-core Network-on-Chips (NoCs). Since its global nature may cause heavy serialization resulting in large performance penalty, barrier synchronization should be carefully designed to have low latency communication and to minimize overall completion time. Therefore, in the paper, we...

An Efficient Architectural Design of Hardware Interface for Heterogeneous Multi-core System

Managing message passing among processor cores with lower overhead is a great challenge now that multi-core systems are the contemporary solution for meeting high-performance and low-energy demands in general and embedded computing domains. Generally speaking, the network-on-chip connects the distributed multi-core system. It takes charge of message passing, which includes data and...

Publication date: 2007